klotz: movies* + imdb* + tf-idf* + lda* + cosine similarity*


  1. How can you learn about the underlying structure of documents in a way that is informative and intuitive? This basic motivating question led me on a journey to visualize and cluster documents in a two-dimensional space. What you see above is the output of an analytical pipeline that began by gathering synopses of the top 100 films of all time and ended by analyzing the latent topics within each document. In between, I ran significant manipulations on these synopses (tokenization, stemming), transformed them into a vector space model (tf-idf), and clustered them into groups (k-means). You can learn all about how I did this with my detailed guide to Document Clustering with Python. But first, what did I learn?
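
The middle of that pipeline (tokenize, weight terms with tf-idf, compare documents) can be illustrated with a minimal sketch. This is not the author's code: it uses only the Python standard library, skips stemming and k-means, and assumes a smoothed idf formula. The helper names (`tokenize`, `tfidf_vectors`, `cosine_similarity`) and the three short synopses are made-up placeholders, not data from the original analysis.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs only; no stemming in this sketch.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    # Build one sparse tf-idf vector (a dict of term -> weight) per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed idf (an assumption of this sketch): ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine_similarity(a, b):
    # Cosine of the angle between two sparse vectors; 0.0 if either is empty.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical placeholder synopses, invented for illustration only.
synopses = [
    "A mob boss hands the family crime business to his son.",
    "A young man is drawn into his family's mob business.",
    "A farm girl is swept away to a magical land by a tornado.",
]

vectors = tfidf_vectors(synopses)
sim_01 = cosine_similarity(vectors[0], vectors[1])
sim_02 = cosine_similarity(vectors[0], vectors[2])
print(f"synopsis 0 vs 1: {sim_01:.3f}")
print(f"synopsis 0 vs 2: {sim_02:.3f}")
```

The two synopses that share vocabulary score higher than the unrelated pair. A clustering step such as k-means would then group documents using distances derived from these vectors.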


